Competition Between Reinforcement Learning Methods in a Predator-Prey Grid World
Abstract
Tabular and linear function approximation variants of Monte Carlo, temporal difference, and eligibility trace learning methods are compared in a simple predator-prey grid world from which the prey is able to escape. The methods are compared both in terms of how well they lead a prey agent to escape randomly moving predators and in terms of how well they perform in competition with each other, when one agent controls the prey and each predator is controlled by a different type of agent. Results show that tabular methods, which must use a partial state representation because of the size of the full state space, do surprisingly well against linear function approximation methods, which can use a full state representation and generalize their behavior across states.
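The contrast between tabular and linear function approximation methods can be illustrated with a minimal sketch. It assumes plain one-step temporal difference updates; the abstract does not say which update rules, features, or parameters the paper uses, so the function names, the step size `alpha`, the discount `gamma`, and the two-element feature vectors below are all hypothetical.

```python
from collections import defaultdict


def tabular_td_update(q, state, action, reward, next_value, alpha=0.1, gamma=0.9):
    """One-step TD update on a lookup table keyed by (state, action).

    Only the visited entry changes; no other state is affected.
    """
    td_error = reward + gamma * next_value - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


def linear_value(weights, features):
    """Linear value estimate: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def linear_td_update(weights, features, reward, next_value, alpha=0.1, gamma=0.9):
    """One-step semi-gradient TD update for a linear value estimate.

    Every weight with a nonzero feature moves, so states that share
    features with the visited one change their estimates too.
    """
    td_error = reward + gamma * next_value - linear_value(weights, features)
    for i, x in enumerate(features):
        weights[i] += alpha * td_error * x
    return td_error


# Tabular: one entry is updated, an unvisited state stays at zero.
q = defaultdict(float)
tabular_td_update(q, state=(0, 0), action="up", reward=1.0, next_value=0.0)

# Linear: updating on one feature vector shifts the estimate for another
# state that shares the first (hypothetical) feature.
weights = [0.0, 0.0]
linear_td_update(weights, features=[1.0, 0.5], reward=1.0, next_value=0.0)
other_state_value = linear_value(weights, [1.0, 0.0])
```

In this sketch the tabular update leaves every unvisited entry untouched, while the linear update also raises the estimate of the second state. That is the generalization across states the abstract refers to.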
Similar resources
Dynamics of an eco-epidemic model with stage structure for predator
The predator-prey model with stage structure for predator is generalized in the context of ecoepidemiology, where the prey population is infected by a microparasite and the predator completely avoids consuming the infected prey. The intraspecific competition of infected prey is considered. All the equilibria are characterized and the existence of a Hopf bifurcation at the coexistence equilibriu...
DYNAMIC COMPLEXITY OF A THREE SPECIES COMPETITIVE FOOD CHAIN MODEL WITH INTER AND INTRA SPECIFIC COMPETITIONS
The present article deals with interspecific and intraspecific competition among predator populations in a prey-dependent three-component food chain model consisting of two competing predators sharing one prey species as their food. The behaviour of the system near the biologically feasible equilibria is thoroughly analyzed. Boundedness and dissipativeness of the system are e...
Stability analysis of a fractional order prey-predator system with nonmonotonic functional response
In this paper, we introduce a planar fractional-order prey-predator system with a nonmonotonic functional response and anti-predator behaviour, in which adult prey can attack vulnerable predators. We analyze the existence and stability of all possible equilibria. Numerical simulations reveal that anti-predator behaviour not only makes the coexistence of the prey and predator ...
Multi-Agent Model-Based Reinforcement Learning Experiments in the Pursuit Evasion Game
This paper describes multi-agent learning experiments performed on tactical sequences of the pursuit evasion game on very small grids. It underlines the performance difference between a centralized approach and a distributed approach when using Rmax, a model-based reinforcement learning algorithm. The prey’s goal is to go out of the grid and the predators’ goal is to kill the prey. The prey may...
LIMITED GROWTH PREY MODEL AND PREDATOR MODEL USING HARVESTING
In this paper, we have proposed a study on controllability and optimal harvesting of a prey predator model and mathematical non linear formation of the equation equilibrium point of Routh harvest stability analysis. The problem of determining the optimal harvest policy is solved by invoking Pontryagin's maximum principle. Dynamic optimization of the harvest policy is studied by taking the combined h...